Introduction to the modelsummary package

1 Before Starting

Make sure “Render on Save” is turned on. This lets you see changes to the Quarto document you are working on without manually re-rendering the output file: the output updates every time you save the file (Cmd+S for Mac users, Ctrl+S for Windows users).

1.1 Learning Objectives

  • By the end of this section, you will know how to use the modelsummary package to create publication-quality regression and summary tables.

1.2 Data

library(AER) # provides the CPS1988 data
library(data.table)
library(modelsummary)
data(CPS1988)
# I prefer to convert the data to a data.table.
setDT(CPS1988)

2 Introduction to modelsummary package

2.1 Introduction

The modelsummary package lets you create clean summary tables that report descriptive statistics of the data and regression results.

Today, we mainly use two functions in the modelsummary package:

  • datasummary(): to create a summary table for the descriptive statistics of the data.
  • modelsummary(): to create a summary table for the regression results.

Check the documentation for more details.

Note
  • Another package, stargazer, can also create summary tables, but it is no longer maintained, so I recommend using the modelsummary package instead.
  • modelsummary package is compatible with

2.2 A Taste of the modelsummary package

Below, I show the kinds of tables you can create with the modelsummary package. Don’t try to understand the code for now; just look at the output tables.

datasummary(
  wage + education + experience ~ Mean + SD + Min + Max,
  data = CPS1988
  )
Table 1: Example of Summary Statistics
             Mean     SD       Min     Max
wage         603.73   453.55   50.05   18777.20
education    13.07    2.90     0.00    18.00
experience   18.20    13.08    -4.00   63.00
library(magrittr) # provides the %>% pipe used below
# change the base group for ethnicity to "cauc"
ex_dt <-
  copy(CPS1988) %>%
  .[,ethnicity := relevel(as.factor(ethnicity), ref = "cauc")]

ls_regs <- 
  list(
    "OLS 1" = lm(log(wage) ~ education, data = ex_dt),
    "OLS 2" = lm(log(wage) ~ education + experience + I(experience^2), data = ex_dt),
    "OLS 3" = lm(log(wage) ~ education + experience + I(experience^2) + ethnicity, data = ex_dt)
  )

modelsummary(
  models = ls_regs,
  output = "latex",
  coef_map = c(
    "education" = "Education", 
    "experience" = "Experience", 
    "I(experience^2)" = "Experience squared",
    # with "cauc" as the base group, this dummy flags African American workers
    "ethnicityafam" = "African American"
    ),
  stars  =  c("*" = .05, "**" = .01, "***" = .001), 
  gof_map = c("nobs", "r.squared",  "adj.r.squared"),
  notes = list("Std. Errors in parentheses")
  )
Table 2: Example regression results
                     OLS 1      OLS 2       OLS 3
Education            0.076***   0.087***    0.086***
                     (0.001)    (0.001)     (0.001)
Experience                      0.078***    0.077***
                                (0.001)     (0.001)
Experience squared              -0.001***   -0.001***
                                (0.000)     (0.000)
African American                            -0.243***
                                            (0.013)
Num.Obs.             28155      28155       28155
R2                   0.095      0.326       0.335
R2 Adj.              0.095      0.326       0.335
* p < 0.05, ** p < 0.01, *** p < 0.001
Std. Errors in parentheses

2.3 Regression Tables with the modelsummary() function

Let’s start with the modelsummary() function to create a summary table for the regression results.

2.4 modelsummary() function

2.4.1 Basics

Syntax

The main argument of the modelsummary() function is models, a list of the regression models you want to report in the table.

modelsummary(models=list(model1, model2, model3))

Example

reg1 <- lm(log(wage) ~ education, data = CPS1988)
reg2 <- lm(log(wage) ~ education + experience + I(experience^2), data = CPS1988)

modelsummary(models=list(reg1, reg2))
                  (1)          (2)
(Intercept)       5.178        4.278
                  (0.019)      (0.019)
education         0.076        0.087
                  (0.001)      (0.001)
experience                     0.078
                               (0.001)
I(experience^2)                -0.001
                               (0.000)
Num.Obs.          28155        28155
R2                0.095        0.326
R2 Adj.           0.095        0.326
AIC               405753.0     397432.7
BIC               405777.7     397473.9
Log.Lik.          -29139.853   -24977.715
F                 2941.787     4545.929
RMSE              0.68         0.59
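
The (1) and (2) headers in the table above are what modelsummary() uses when the list of models is unnamed. As a minimal sketch, assuming the AER and modelsummary packages are installed, the block below shows that naming the list elements sets the column headers instead. The output = "data.frame" argument is used here only so that the column names can be inspected in the console, and tab_default and tab_named are illustrative names. The data() call reloads CPS1988 so the block runs on its own; skip it if the data is already loaded.

```r
library(modelsummary)

# CPS1988 ships with the AER package (assumed to be installed)
data("CPS1988", package = "AER")

reg1 <- lm(log(wage) ~ education, data = CPS1988)
reg2 <- lm(log(wage) ~ education + experience + I(experience^2), data = CPS1988)

# Unnamed list: modelsummary() picks default column headers
tab_default <- modelsummary(list(reg1, reg2), output = "data.frame")

# Named list: the names become the column headers
tab_named <- modelsummary(
  list("OLS 1" = reg1, "OLS 2" = reg2),
  output = "data.frame"
)

# Compare the column names of the two tables
names(tab_default)
names(tab_named)
```

Naming the list at the point where the models are collected keeps the labels next to the model definitions, which is the pattern the next example follows.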
modelsummary(
  models = list("OLS 1" = reg1, "OLS 2" = reg2),
  coef_map = c(
    "education" = "Education", 
    "experience" = "Experience", 
    "I(experience^2)" = "Experience squared"
    ),
  stars  =  c("*" = .05, "**" = .01, "***" = .001)
  # coef_omit = 1
  )
                     OLS 1        OLS 2
Education            0.076***     0.087***
                     (0.001)      (0.001)
Experience                        0.078***
                                  (0.001)
Experience squared                -0.001***
                                  (0.000)
Num.Obs.             28155        28155
R2                   0.095        0.326
R2 Adj.              0.095        0.326
AIC                  405753.0     397432.7
BIC                  405777.7     397473.9
Log.Lik.             -29139.853   -24977.715
F                    2941.787     4545.929
RMSE                 0.68         0.59
* p < 0.05, ** p < 0.01, *** p < 0.001
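
The intercept is missing from the table above because coef_map keeps only the coefficients it lists, which is why the coef_omit line can stay commented out. When you want to drop a term without renaming the others, coef_omit can be used on its own. The sketch below assumes that coef_omit accepts a regular expression matched against the term names (check ?modelsummary for your installed version); tab_no_intercept is an illustrative name, and output = "data.frame" is used only to make the result easy to inspect.

```r
library(modelsummary)

# CPS1988 ships with the AER package (assumed to be installed)
data("CPS1988", package = "AER")

reg1 <- lm(log(wage) ~ education, data = CPS1988)
reg2 <- lm(log(wage) ~ education + experience + I(experience^2), data = CPS1988)

# Drop every term whose name matches "Intercept"; other terms keep their raw names
tab_no_intercept <- modelsummary(
  list("OLS 1" = reg1, "OLS 2" = reg2),
  coef_omit = "Intercept",
  output = "data.frame"
)

# Terms that remain in the table
unique(tab_no_intercept$term)
```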
# msummary() is a shorthand alias for modelsummary() and gives the same table
msummary(
  models = list("OLS 1" = reg1, "OLS 2" = reg2),
  coef_map = c(
    "education" = "Education", 
    "experience" = "Experience", 
    "I(experience^2)" = "Experience squared"
    ),
  stars  =  c("*" = .05, "**" = .01, "***" = .001)
  )
                     OLS 1        OLS 2
Education            0.076***     0.087***
                     (0.001)      (0.001)
Experience                        0.078***
                                  (0.001)
Experience squared                -0.001***
                                  (0.000)
Num.Obs.             28155        28155
R2                   0.095        0.326
R2 Adj.              0.095        0.326
AIC                  405753.0     397432.7
BIC                  405777.7     397473.9
Log.Lik.             -29139.853   -24977.715
F                    2941.787     4545.929
RMSE                 0.68         0.59
* p < 0.05, ** p < 0.01, *** p < 0.001
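
The tables in this subsection report the full default set of goodness-of-fit statistics (AIC, BIC, Log.Lik., and so on). As a hedged sketch, the gof_map argument used for Table 2 can trim that block to the statistics you actually want to report. The block assumes the AER and modelsummary packages are installed; tab_gof is an illustrative name, and output = "data.frame" is used only so the rows can be inspected.

```r
library(modelsummary)

# CPS1988 ships with the AER package (assumed to be installed)
data("CPS1988", package = "AER")

reg1 <- lm(log(wage) ~ education, data = CPS1988)
reg2 <- lm(log(wage) ~ education + experience + I(experience^2), data = CPS1988)

# Keep only the number of observations and R-squared in the bottom block
tab_gof <- modelsummary(
  list("OLS 1" = reg1, "OLS 2" = reg2),
  gof_map = c("nobs", "r.squared"),
  output = "data.frame"
)

# Rows belonging to the goodness-of-fit part of the table
tab_gof[tab_gof$part == "gof", ]
```

Dropping output = "data.frame" returns the usual formatted table with the shortened goodness-of-fit block.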